perm filename A81.TEX[254,RWF] blob sn#877556 filedate 1989-09-21 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00002 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	\input macro[1,mps]
C00007 ENDMK
C⊗;
\input macro[1,mps]
\input macrwf[1,mps]
\magnification\magstephalf
\baselineskip 14pt
\leftline{\sevenrm A81.Tex[254,rwf]\today}
\leftline{\copyright\sevenrm Robert W. Floyd, August 28, 1989}
\leftline{\sevenrm Unpublished draft}
\bigskip
\noindent{\bf The Decomposition Theorem for CFGs}
\medskip
If $x↓1x↓2\ldots x↓n \aRa y$, then $y$ can be decomposed into $y↓1y↓2\ldots
y↓n$ where $x↓i\aRa y↓i$.  All the ideas of the proof are exemplified in the 
following two cases.

Case 1:  $x↓1x↓2 \Rao y$; \hbox{let} $y↓1 = x↓1, y↓2 = x↓2$.

Case 2:  $x↓1x↓2 \Ram y, x↓1 = uAw, A→v, x↓1x↓2 = uAw x↓2\Rightarrow uvw x↓2
\Ramm y$.

By induction on $m$, $uvw \aRa y↓1, x↓2 \aRa y↓2, y = y↓1y↓2$.  Then $x↓1 = uAw 
\Rightarrow uvw \aRa y↓1, x↓2 \aRa y↓2$.  In fact, $x↓1 \Rai y↓1, x↓2 \Raj y↓2,
i + j = m-1,\, \hbox{so}\, \{i, j\} < m$.

If $A \Ram y \in T↑\ast$, (so $A\not= y$, $m > 0)$, in Chomsky form
$A → BC \Ramm y$.  By the decomposition theorem $B \Rai y↓1, C \Raj y↓2$, $y = y↓1
y↓2$, $i < m,\, j < m$.  This allows a convenient form of induction on $m$.

Let $G$ be a {\rm CFG} in Chomsky form, with alphabet $N+T$ and root $S$.  Let
$D$ be a digraph, with each arc bearing a label from $T$.  Let $L↓{ij}$ be
the set of strings labeling paths in $D$ from $i$ to $j$.  We show that
languages $L↓G \cap L↓{pq}$ are {\rm CFLs}.

Let $G'$ have nonterminals $A↓{ij}$, $A\in N$, and root $S↓{pq}$.  For
each $A → BC$ in $G$, and for all modes $i, j, k$ of $D$, put $A↓{ik} →
B↓{ij} C↓{jk}$ into $G$.  If $\langle i, t, j\rangle$ is an arc in $D$
and $A → t$ in $G$, put $A↓{ij} → t$ in $G'$.

By the Chomskian induction principle, assume $A\aRag y\in T↑\ast$ and $y$
labels a path from $i$ to $k$ in $D$.  If $m=1$, $A\tog t = y$, $\mid x\mid
= 1$, $t$ labels an arc $\langle i, t, k\rangle$ and $A↓{ik} \togp t$.

If $m > 1$, $A \mtog BC \aRa y↓1y↓2$, $y↓1$ labels a path from $i$ to $j$,
$y↓2$ labels a path from $j$ to $k$, for some $j$, $B \aRa y$,
$C \aRa y↓2$.  By Chomskian induction, $B↓{ij} \aRa y↓1$, $C↓{jk} \aRagp
y↓2$, so $A↓{ik} \togp B↓{ij} C↓{jk} \aRagp y↓1y↓2 = y$.  This shows, if
$A \aRag x\in T↑\ast$ and $x$ labels a path from $i$ to $k$ in $D$, then
$A↓{ik} \aRagp x$.  The converse is easy and is left to the reader.

This proves:

\noindent{\bf Theorem:} The intersection of a CFL and a FML is a CFL.

\vfill\eject
By a slightly more elaborate construction, we show that the image of a CFL
under a nondeterministic finite-state transduction is a CFL.  
This includes
union and intersection with FMLs, string morphisms and their inverses,
shuffle with FMLs, $\ldots$

[RWF:  Draw figure here]

\vskip 1in
To show PDLs are CFLs, start with the stack language as a CFL.
Transduce into instructions.  Intersect with the control language.  Select the
input/output characters

{\narrower\smallskip\noindent
{\bf (Exercise:} directly construct a CFL from a PDM)\smallskip}

To show CFLs are PDLs, push $S$, pop $A$, push a RHS for $A$.
\bye